SwePub

Results for search "db:Swepub ;pers:(Lu Zhonghai);pers:(Zhao Xueqian)"

  • Results 1-9 of 9
1.
  • Chen, X., et al. (author)
  • Achieving memory access equalization via round-trip routing latency prediction in 3D many-core NoCs
  • 2015
  • In: Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI. - : IEEE. ; pp. 398-403
  • Conference paper (peer-reviewed). Abstract:
    • 3D many-core NoCs are emerging architectures for future high-performance single chips because they integrate many processor cores and memories by stacking multiple layers. In such an architecture, because processor cores and memories reside in different locations (center, corner, edge, etc.), memory accesses behave differently due to their different communication distances, and the performance (latency) gap between memory accesses grows as the network size is scaled up. As a result, some memory accesses may suffer very high latencies, degrading system performance. To achieve high performance, it is crucial to reduce the number of memory accesses with very high latencies. However, this should be done with care, since shortening the latency of one memory access can worsen the latency of another as a result of shared network resources. Therefore, the goal should be to narrow the latency difference between memory accesses. In the paper, we address this goal by prioritizing memory access packets based on the predicted round-trip routing latencies of memory accesses. The communication distance and the number of occupied items in the buffers along the remaining routing path are used to predict the round-trip latency of a memory access. The predicted round-trip routing latency is used as the basis for arbitrating among memory access packets, so that a memory access with potentially high latency is transferred as early and as fast as possible, thus equalizing the memory access latencies as much as possible. Experiments with varied network sizes and packet injection rates show that our approach achieves memory access equalization and outperforms classic round-robin arbitration in terms of maximum latency, average latency, and LSD. In the experiments, the maximum improvements in maximum latency, average latency, and LSD are 80%, 14%, and 45%, respectively.
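The arbitration idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the latency model (a fixed cost per remaining hop plus one cycle per occupied buffer item on the remaining path), the `Packet` fields, and the constant `CYCLES_PER_HOP` are all assumptions made for illustration.

```python
from dataclasses import dataclass

CYCLES_PER_HOP = 2  # assumed router + link cost per hop, in cycles


@dataclass
class Packet:
    name: str
    remaining_hops: int   # hops left on the round trip (request + reply)
    buffered_ahead: int   # occupied buffer items on the remaining path


def predicted_latency(p: Packet) -> int:
    """Hypothetical predictor: a distance term plus a congestion term."""
    return p.remaining_hops * CYCLES_PER_HOP + p.buffered_ahead


def arbitrate(candidates: list[Packet]) -> Packet:
    """Grant the port to the packet with the highest predicted latency."""
    return max(candidates, key=predicted_latency)


requests = [
    Packet("center", remaining_hops=4, buffered_ahead=1),   # predicted 9
    Packet("corner", remaining_hops=10, buffered_ahead=3),  # predicted 23
    Packet("edge", remaining_hops=7, buffered_ahead=6),     # predicted 20
]
winner = arbitrate(requests)
print(winner.name)  # corner
```

A round-robin arbiter would ignore both terms and grant the three requests in turn; here the packet predicted to take longest is served first, which is the equalizing effect the abstract describes.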
2.
  • Lu, Zhonghai, et al. (author)
  • xMAS-Based QoS Analysis Methodology
  • 2018
  • In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 0278-0070 .- 1937-4151. ; 37:2, pp. 364-377
  • Journal article (peer-reviewed). Abstract:
    • On-chip communication system design starting from a high-level model can facilitate formal verification of system properties, such as safety and deadlock freedom. Yet analyzing its quality-of-service (QoS) property, in our context the per-flow delay bound, is an open challenge. Based on the executable micro-architectural specification (xMAS), a formal framework for modeling communication fabrics, we first present how to model a classic input-queuing virtual-channel router using the xMAS primitives and then a QoS analysis methodology using network calculus (NC). Thanks to the precise semantics of the xMAS primitives, the router can be modeled in different variants, which cannot otherwise be captured by normal ad hoc box diagrams. The analysis methodology consists of three steps: 1) given network and flow knowledge, we first create a well-defined, precise xMAS model for a specific application on a concrete on-chip network; 2) the specific xMAS model is then mapped to an NC graph (NCG) following a set of mapping rules; and 3) finally, existing QoS analysis techniques can be applied to analyze the NCG to obtain the end-to-end delay bound per flow. We also show how to apply the technique to a typical all-to-one communication pattern on a binary-tree network and conduct an SoC case study, exemplifying the step-by-step analysis procedure and discussing the tightness of the results.
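To give a feel for the kind of result step 3 produces, the sketch below applies a textbook network calculus formula rather than the paper's xMAS-to-NCG mapping. It assumes a single flow with token-bucket arrival curve α(t) = σ + ρt crossing latency-rate servers with service curves β(t) = R·max(0, t − T), with no contention from other flows. The function name and the numbers are illustrative only.

```python
def delay_bound(sigma: float, rho: float,
                servers: list[tuple[float, float]]) -> float:
    """End-to-end delay bound for a (sigma, rho) flow.

    Each server is a pair (R, T). Concatenating latency-rate servers gives
    a latency-rate server with rate min(R_i) and latency sum(T_i), and the
    delay bound for a token-bucket flow is then T + sigma / R.
    """
    rate = min(r for r, _ in servers)
    latency = sum(t for _, t in servers)
    if rho > rate:
        raise ValueError("unstable: arrival rate exceeds service rate")
    return latency + sigma / rate


# Illustrative values: burst of 4 flits, 0.2 flits/cycle, two routers in series.
path = [(1.0, 3.0), (0.5, 2.0)]
print(delay_bound(sigma=4.0, rho=0.2, servers=path))  # 13.0
```

Because the two servers are concatenated before the bound is taken, the burst term σ/R is paid only once along the path instead of once per router.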
3.
  • Saggio, Alberto, et al. (author)
  • Validating delay bounds in networks on chip : Tightness and pitfalls
  • 2015
  • In: Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI. - : Institute of Electrical and Electronics Engineers (IEEE). ; pp. 404-409
  • Conference paper (peer-reviewed). Abstract:
    • Analytical methods for estimating on-chip network performance can be very useful to accelerate and simplify the design process of Networks on Chip. However, to increase confidence in these approaches, it is fundamental to perform systematic studies that assess their potential. We present a methodical investigation of the tightness between analytical end-to-end delay bounds and worst-case simulation latencies in various scenarios. We first introduce our network-calculus-based analytical technique to derive per-flow communication delay bounds. Then, we examine the worst-case performance analysis process in NoCs, outlining the major aspects that affect the tightness. Finally, experimental results confirm our deductions and allow us to provide general guidelines to avoid pitfalls in the validation process of analytical delay bounds.
4.
  • Zhao, Xueqian, et al. (author)
  • A Tool for xMAS-Based Modeling and Analysis of Communication Fabrics in Simulink
  • 2017
  • In: ACM Transactions on Modeling and Computer Simulation. - : ASSOC COMPUTING MACHINERY. - 1049-3301 .- 1558-1195. ; 27:3
  • Journal article (peer-reviewed). Abstract:
    • The eXecutable Micro-Architectural Specification (xMAS) language developed in recent years provides an effective way to model on-chip communication fabrics and enables performance-bound analysis with network calculus at the micro-architectural level. For network-on-chip (NoC) performance analysis, model validation is essential to ensure correctness and accuracy. To facilitate xMAS modeling and the corresponding analysis validation, this work presents a unified platform based on xMAS in Simulink. The platform provides a friendly graphical user interface for xMAS modeling and parameter setup by taking advantage of the Simulink modeling environment. A regulator and a latency-rate server are added to the xMAS primitive set to support typical flow and service behaviors. Hierarchical model build-up and Verilog-HDL code generation are supported to manage complex models and to conduct cycle-accurate, bit-accurate simulations. Based on the generated simulation models of xMAS, the tool is applied to evaluate the tightness of analytical delay bound results. We demonstrate the application and the workflow of the xMAS tool through a two-agent communication example and an all-to-one communication example with a tree topology.
5.
  • Zhao, Xueqian, et al. (author)
  • Backlog bound analysis for virtual-channel routers
  • 2015
  • In: 2015 IEEE Computer Society Annual Symposium on VLSI. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781479987191 ; pp. 422-427
  • Conference paper (peer-reviewed). Abstract:
    • Backlog bound analysis is crucial for predicting buffer sizing boundaries in on-chip virtual-channel routers. However, the complicated resource contention among traffic flows makes the analysis difficult. Because conventional simulation-based approaches are generally incapable of investigating the worst-case scenarios for the backlog bounds, we propose a formal analysis technique. We identify basic buffer use scenarios and propose corresponding analysis models to formally deduce the per-buffer backlog bound using network calculus. A topology-independent analysis technique is developed to carry out the per-buffer backlog bound analysis step by step. We further develop an algorithm to automate the analysis procedure with polynomial complexity. A case study shows how to apply the technique and that the analytical bounds are tight.
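The idea of a step-by-step, per-buffer analysis can be sketched with standard network calculus results. This is a simplified illustration, not the paper's models: it assumes a single token-bucket flow (σ, ρ) with no contention or virtual-channel sharing, passing through latency-rate servers (R, T). Under those assumptions the backlog at a node is bounded by σ + ρT, and the same value is the burstiness of the flow leaving that node. All names and numbers are hypothetical.

```python
def per_buffer_backlog(sigma: float, rho: float,
                       servers: list[tuple[float, float]]) -> list[float]:
    """Backlog bound at each buffer along a path of latency-rate servers."""
    bounds = []
    burst = sigma
    for rate, latency in servers:
        if rho > rate:
            raise ValueError("unstable: arrival rate exceeds service rate")
        bound = burst + rho * latency  # backlog bound at this buffer
        bounds.append(bound)
        burst = bound                  # burstiness seen by the next buffer
    return bounds


# Illustrative values: burst of 4 flits, 0.5 flits/cycle, two routers.
print(per_buffer_backlog(4.0, 0.5, [(1.0, 2.0), (1.0, 4.0)]))  # [5.0, 7.0]
```

Each bound suggests a minimum buffer depth for its router under these assumptions. The loop visits every buffer once, so this simplified version runs in linear time.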
6.
  • Zhao, Xueqian, et al. (author)
  • Empowering study of delay bound tightness with simulated annealing
  • 2014
  • In: Proceedings - Design, Automation and Test in Europe, DATE. - 9783981537024
  • Conference paper (peer-reviewed). Abstract:
    • Studying delay bound tightness typically takes a practical approach: comparing simulated results against analytical results. However, this is often a manual process in which many simulation parameters have to be configured before the simulations run, which is tedious and time-consuming. We propose a technique to automate this process using simulated annealing. We formulate the problem as an online optimization problem and embed a simulated annealing algorithm in the simulation environment to guide the search for configuration parameters that give good tightness results. This is a fully automated procedure and thus provides a promising path to automatic design space exploration in similar contexts. Experimental results for an all-to-one communication network with a large search space and complicated constraints illustrate the effectiveness of our method.
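The search loop described in this abstract might look roughly like the following sketch. Everything specific in it is an assumption: tightness is taken to be the simulated worst-case delay divided by the analytical bound, `simulate_max_delay` is a toy stand-in for a real simulator, and the two integer parameters, their 0..10 range, and the cooling schedule are invented for illustration.

```python
import math
import random

ANALYTIC_BOUND = 42.0  # assumed analytical delay bound of the toy system


def simulate_max_delay(config: tuple[int, int]) -> float:
    """Toy stand-in for a simulation run (hypothetical model)."""
    burst, offset = config
    return 40.0 - 0.3 * (burst - 6) ** 2 - 0.2 * (offset - 3) ** 2


def tightness(config: tuple[int, int]) -> float:
    return simulate_max_delay(config) / ANALYTIC_BOUND


def neighbor(config, rng):
    """Move one parameter by +/-1, clamped to the assumed range 0..10."""
    values = list(config)
    i = rng.randrange(len(values))
    values[i] = min(10, max(0, values[i] + rng.choice((-1, 1))))
    return tuple(values)


def anneal(start, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing that maximizes tightness over configurations."""
    rng = random.Random(seed)
    current = best = start
    temp = t0
    for _ in range(steps):
        cand = neighbor(current, rng)
        delta = tightness(cand) - tightness(current)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = cand
        if tightness(current) > tightness(best):
            best = current
        temp *= cooling
    return best, tightness(best)


best_config, best_tightness = anneal(start=(0, 10))
print(best_config, round(best_tightness, 3))
```

In a real setup each call to `tightness` would launch a simulation, so the number of steps directly determines the cost of the search.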
7.
  • Zhao, Xueqian, et al. (author)
  • Heuristics-Aided Tightness Evaluation of Analytical Bounds in Networks-on-Chip
  • 2015
  • In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. - 0278-0070 .- 1937-4151. ; 34:6, pp. 986-999
  • Journal article (peer-reviewed). Abstract:
    • Studying the tightness of analytical delay and backlog bounds is critical for network-on-chip designs, since formal analysis predicts the boundaries of communication delay and buffer dimensioning. However, this evaluation is often a tedious, time-consuming, manual simulation process in which many simulation parameters have to be configured before the simulations run. We formulate the tightness evaluation as constrained optimization problems for delay bounds and backlog bounds, respectively. The well-defined problems enable a fully automated configuration search, which can be guided by a heuristic algorithm with cycle-accurate simulations integrated. This fully automated procedure provides a promising path to automatic design space exploration in similar contexts. Experimental results over various topologies and traffic patterns indicate that our method is effective in finding the configuration with the best tightness, up to 98%, even when up to 50 parameters are configured in a multidimensional discrete search space under complex constraints.
8.
  • Zhao, Xueqian, 1986- (author)
  • Network on Chip : Performance Bound and Tightness
  • 2015
  • Doctoral thesis (other academic/artistic). Abstract:
    • Featuring good scalability, modularity, and large bandwidth, Network-on-Chip (NoC) has been widely applied in manycore Chip Multiprocessor (CMP) and Multiprocessor System-on-Chip (MPSoC) architectures. The provision of guaranteed service emerges as an important NoC design problem due to application requirements in Quality-of-Service (QoS). Formal analysis of performance bounds plays a critical role in ensuring guaranteed service of NoC by giving insights into how the design parameters impact the network performance. The study in this thesis proposes analysis methods for delay and backlog bounds with Network Calculus (NC). Based on xMAS (eXecutable Micro-Architectural Specification), a formal framework to model communication fabrics, the delay bound analysis procedure is presented using NC. The micro-architectural xMAS representation of a canonical on-chip router is proposed with both the data flow and control flow well captured. Furthermore, a well-defined xMAS model for a specific application on an NoC can be created with network and flow knowledge and then be mapped to the corresponding NC analysis model for end-to-end delay bound calculation. The xMAS model effectively bridges the gap between the informal NoC micro-architecture and the formal analysis model. Besides the delay bound, the analysis of the backlog bound is also crucial for predicting buffer dimensioning boundaries in on-chip Virtual Channel (VC) routers. In this thesis, basic buffer use cases are identified and corresponding analysis models are proposed so as to decompose the complex flow contention in a network. Then we develop a topology-independent analysis technique to carry out the backlog bound analysis step by step. Algorithms are developed to automate this analysis procedure. Accompanying the analysis of performance bounds, tightness evaluation is an essential step to ensure the validity of the analysis models. However, this evaluation process is often a tedious, time-consuming, and manual simulation process in which many simulation parameters may have to be configured before the simulations run. In this thesis, we develop a heuristics-aided tightness evaluation method for the analytical delay and backlog bounds. The tightness evaluation is abstracted as constrained optimization problems with the objectives formulated as implicit functions with respect to the system parameters. Based on the well-defined problems, heuristics can be applied to guide a fully automated configuration searching process which incorporates cycle-accurate, bit-accurate simulations. As an example of heuristics, Adaptive Simulated Annealing (ASA) is adopted to guide the search in the configuration space. Experimental results indicate that the performance analysis models based on NC give tight results, which are effectively found by the heuristics-aided evaluation process even when the model has a multidimensional discrete search space and complex constraints. In order to facilitate xMAS modeling and corresponding validation of the performance analysis models, the thesis presents an xMAS tool developed in Simulink. It provides a friendly graphical interface for xMAS modeling and parameter configuration based on the powerful Simulink modeling environment. Hierarchical model build-up and Verilog-HDL code generation are supported to manage complex models and conduct simulations. Owing to the synthesizable xMAS library and good extensibility, this xMAS tool has promising use in application-specific NoC design based on the xMAS components.
9.
  • Zhao, Xueqian, et al. (author)
  • Per-flow delay bound analysis based on a formalized microarchitectural model
  • 2013
  • In: 2013 7th IEEE/ACM International Symposium on Networks-on-Chip, NoCS 2013. - : IEEE. - 9781467364928 ; p. 6558411-
  • Conference paper (peer-reviewed). Abstract:
    • System design starting from high-level models can facilitate formal verification of system properties, such as safety and deadlock freedom. Yet analyzing their QoS property, in our context the per-flow delay bound, is an open challenge. Based on xMAS (eXecutable Micro-Architectural Specification), a formal framework for modeling communication fabrics, we present a QoS analysis procedure using network calculus. Given network and flow knowledge, we first create a well-defined xMAS model for a specific application on a concrete on-chip network. Then the specific xMAS model can be mapped to its network calculus analysis model, to which existing QoS analysis techniques can be applied to compute the end-to-end delay bound per flow. We give an example to show the step-by-step analysis procedure and discuss the tightness of the results.